Machine-Assisted Rhetorical Structure Annotation
Abstract
Manually annotating the rhetorical structure of texts is very labour-intensive, while high-quality automatic analysis is currently out of reach. We therefore propose splitting the manual annotation into two phases: the simpler marking of lexical connectives and their relations, followed by the more difficult decisions on overall tree structure. To this end, we developed an environment consisting of two analysis tools and XML-based declarative resources. Our ConAno tool allows efficient, interactive annotation of connectives, their scopes and relations. This intermediate result is exported to O’Donnell’s ‘RST Tool’, which facilitates completing the tree structure.
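The first phase described above, marking lexical connectives and their candidate relations, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-entry lexicon, the function names `find_connectives` and `to_xml`, and the `<text>`/`<connective>` element names are hypothetical and do not reflect ConAno's actual resource format or export schema.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mini-lexicon: connective -> candidate relations (illustrative only,
# not the declarative resource used by the tools in the abstract).
LEXICON = {
    "because": ["cause"],
    "although": ["concession"],
    "but": ["contrast", "concession"],
}


def find_connectives(text, lexicon=LEXICON):
    """Return (start, end, surface form, candidate relations) per lexicon hit, in text order."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, lexicon)) + r")\b", re.IGNORECASE
    )
    return [
        (m.start(), m.end(), m.group(0), lexicon[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


def to_xml(text, hits):
    """Wrap each hit in a <connective> element, keeping all other text unchanged."""
    root = ET.Element("text")
    pos = 0
    last = None  # most recently added <connective> element
    for start, end, word, relations in hits:
        chunk = text[pos:start]
        if last is None:
            root.text = chunk   # text before the first connective
        else:
            last.tail = chunk   # text between two connectives
        last = ET.SubElement(root, "connective", relations="|".join(relations))
        last.text = word
        pos = end
    rest = text[pos:]
    if last is None:
        root.text = rest
    else:
        last.tail = rest
    return ET.tostring(root, encoding="unicode")


sentence = "It rained, but we went out because we had to."
print(to_xml(sentence, find_connectives(sentence)))
```

An ambiguous connective such as "but" keeps every candidate relation in the sketch; choosing among them, and fixing the scope of each connective, is exactly the kind of decision the abstract leaves to the human annotator.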
Related resources
The HOLJ Corpus: Supporting Summarisation Of Legal Texts
We describe an XML-encoded corpus of texts in the legal domain which was gathered for an automatic summarisation project. We describe two distinct layers of annotation: manual annotation of the rhetorical status of sentences and an entirely automatic annotation process incorporating a host of individual linguistic processors. The manual rhetorical status annotation has been developed as trainin...
Simple Signals for Complex Rhetorics: On Rhetorical Analysis with Rich-Feature Support Vector Models
Most text displays an internal coherence structure, which can be analyzed as a tree structure of relations that hold between short segments of text. We present a machine-learning governed approach to such an analysis in the framework of Rhetorical Structure Theory. Our rhetorical analyzer observes a variety of textual properties, such as cue phrases, part-of-speech information, rhetorical conte...
Book Review: The Structure of Scientific Articles: Applications to Citation Indexing and Summarization by Simone Teufel
Discourse models have received significant attention in the computational linguistics community with some important connections to the non-computational discourse community. More recently, the importance of discourse annotation has increased as models generated with supervised machine learning techniques are being used to annotate text automatically. A primary area for annotation is science. Th...
The annotation of the Central Unit in Rhetorical Structure Trees: A Key Step in Annotating Rhetorical Relations
This article aims to analyze how agreement regarding the central unit (macrostructure) influences agreement when establishing rhetorical relations (microstructure). To do so, the authors conducted an empirical study of abstracts from research articles in three domains (medicine, terminology, and science) in the framework of Rhetorical Structure Theory (RST). The results help to establish a new ...
The RST Basque TreeBank: an online search interface to check rhetorical relations
This paper introduces the first Basque discourse TreeBank annotated with rhetorical relations following Rhetorical Structure Theory. We report the main features of the corpus, such as the annotation criteria, inter-annotator agreement and harmonization procedure. We describe an online search system to check the annotation of discourse relations.